EDACluster: An Evolutionary Density and Grid-Based Clustering Algorithm
Abstract
This paper presents EDACluster, an Estimation of Distribution Algorithm (EDA) applied to the clustering task. An EDA is an evolutionary algorithm, used here to optimize the search for adequate clusters when very little is known about the target dataset. The proposed algorithm uses a mixed density- and grid-based approach to identify sets of dense cells in the dataset. The output is a list of items and their associated clusters. Items in low-density areas are considered noise and are not assigned to any cluster. The tests use four public-domain datasets to compare EDACluster with DBSCAN, a conventional density-based clustering algorithm.
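The grid-and-density part of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and leaves out the EDA search entirely: the function name `grid_density_cluster` and the parameters `cell_size` and `min_pts` are hypothetical, and both parameters are fixed by hand here, whereas EDACluster would presumably search for suitable settings. Points are binned into grid cells, cells holding at least `min_pts` points count as dense, adjacent dense cells are merged into one cluster, and points in non-dense cells are labelled as noise (-1).

```python
from collections import Counter, deque
from itertools import product


def grid_density_cluster(points, cell_size, min_pts):
    """Label each point with a cluster id, or -1 for noise (illustrative only)."""
    # Map every point to the integer coordinates of the grid cell containing it.
    cells = [tuple(int(c // cell_size) for c in p) for p in points]

    # A cell is "dense" when it holds at least min_pts points (assumed criterion).
    counts = Counter(cells)
    dense = {cell for cell, n in counts.items() if n >= min_pts}

    # Flood-fill over adjacent dense cells; each connected group is one cluster.
    cell_label = {}
    next_label = 0
    for start in sorted(dense):
        if start in cell_label:
            continue
        cell_label[start] = next_label
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for offset in product((-1, 0, 1), repeat=len(cell)):
                neighbour = tuple(c + o for c, o in zip(cell, offset))
                if neighbour in dense and neighbour not in cell_label:
                    cell_label[neighbour] = next_label
                    queue.append(neighbour)
        next_label += 1

    # Points whose cell is not dense receive the noise label -1.
    return [cell_label.get(cell, -1) for cell in cells]


sample = [(0.1, 0.1), (0.2, 0.3), (1.1, 0.2), (1.3, 0.4), (5.0, 5.0)]
print(grid_density_cluster(sample, cell_size=1.0, min_pts=2))  # [0, 0, 0, 0, -1]
```

In this toy run the two neighbouring dense cells merge into cluster 0 and the isolated point at (5.0, 5.0) is treated as noise, which mirrors the output format the abstract describes: a list of items with their cluster assignments.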
Similar resources
Improvement of density-based clustering algorithm using modifying the density definitions and input parameter
Clustering, one of the main tasks in data mining, means grouping similar samples. There is a wide variety of clustering algorithms, and density-based clustering is one category of them. Various density-based algorithms have been proposed; one of the most widely used is DBSCAN. DBSCAN can identify clusters of different shapes in the dataset and automatically i...
A Study of the Problems of the DBSCAN Clustering Algorithm and a Review of the Improvements Proposed for It
Clustering is an important knowledge discovery technique for databases. Density-based clustering algorithms are among the main methods for clustering in data mining. These algorithms have some special features, including independence from the shape of the clusters, high understandability, and ease of use. DBSCAN is a base algorithm for density-based clustering. DBSCAN is able to...
DENGRIS-Stream: A Density-Grid based Clustering Algorithm for Evolving Data Streams over Sliding Window
Evolving data streams are ubiquitous. Various clustering algorithms have been developed to extract useful knowledge from evolving data streams in real time. Density-based clustering can handle outliers and discover arbitrarily shaped clusters, whereas grid-based clustering offers fast processing. The sliding window is a widely used model for data stream mining due to its e...
Adjustable Probability Density Grid-Based Clustering for Uncertain Data Streams
Most existing traditional grid-based clustering algorithms for uncertain data streams use a fixed meshing method and therefore suffer from low clustering accuracy. In view of these deficiencies, this paper proposes a novel algorithm, APDG-CUStream (Adjustable Probability Density Grid-based Clustering for Uncertain Data Streams), which adopts an online component and an offline component. In o...
An Axis-Shifted Grid-Clustering Algorithm
These spatial clustering methods can be classified into four categories: partitioning, hierarchical, density-based, and grid-based methods. The grid-based clustering algorithm partitions the data space into a finite number of cells to form a grid structure and then performs all clustering operations, which group similar spatial objects into classes, on this obtained grid st...